Robot Audition – Hands-Free Automatic Speech Recognition under Highly-Noisy Environments
Author: Kazuhiro NAKADAI
Abstract
This paper addresses robot audition, which gives robots listening capabilities using robot-embedded microphones. For robot audition, we propose real-time sound source separation and automatic speech recognition (ASR) techniques for dynamically changing environments based on microphone array processing, which are applicable to hands-free ASR in highly noisy environments. An implementation of the proposed techniques is available as open-source robot audition software called “HARK.” We show the effectiveness of these techniques through applications of HARK to robots.
Similar resources
Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots
Automatic Speech Recognition (ASR) which plays an important role in human-robot interaction should be noise-robust because robots are expected to work in noisy environments. Audio-Visual (AV) integration is one of the key ideas to improve the robustness in such environments. This paper proposes two-layered AV integration for ASR which applies AV integration to Voice Activity Detection (VAD) and...
Audio-visual speech recognition system for a robot
Automatic Speech Recognition (ASR) for a robot should be robust to noise because a robot works in noisy environments. Audio-Visual (AV) integration is one of the key ideas to improve its robustness in such environments. This paper proposes AV integration for an ASR system for a robot which applies AV integration to Voice Activity Detection (VAD) and speech decoding. In VAD, we apply AV-integr...
Design and Implementation of Robot Audition System 'HARK' - Open Source Software for Listening to Three Simultaneous Speakers
This paper presents the design and implementation of the HARK robot audition software system consisting of sound source localization modules, sound source separation modules and automatic speech recognition modules of separated speech signals that works on any robot with any microphone configuration. Since a robot with ears may be deployed to various auditory environments, the robot audition sy...
Hands-free Speech Recognition Robust to Distance and Azimuth in Robot Application
In this paper we present two methods for addressing the changes in radial position and azimuth, respectively, relative to the robot and speaker. In the case of the former, room transfer function (RTF) estimation is employed via waveform-level compensation to reflect the change in power caused by the change of radial position to the RTF. In addition, acoustic model-level compensation is also used ...
Leak energy based missing feature mask generation for ICA and GSS and its evaluation with simultaneous speech recognition
This paper addresses automatic speech recognition (ASR) for robots integrated with sound source separation (SSS) by using leak noise based missing feature mask generation. The missing feature theory (MFT) is a promising approach to improve noise-robustness of ASR. An issue in MFT-based ASR is automatic generation of the missing feature mask. To improve robot audition, we applied this theory to ...
Publication date: 2011